Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework Key Update #2237

Closed
wants to merge 23 commits into from
Closed

Rework Key Update #2237

wants to merge 23 commits into from

Conversation

martinthomson
Copy link
Member

@martinthomson martinthomson commented Dec 21, 2018

This section was pretty old and it had at least one bug, along with a bunch of editorial lack of clarity.

Substantively, this does two things:

  1. It specifies what label to use for the KDF when updating. The agreement at one time was that we would use a quic-specific label, but that never really got captured properly. This proposes "quic ku".

  2. It takes another bit to signal whether a key update is permitted. As noted in simultaneous key update can lead to deadlock #2214, simultaneous updates can result in a deadlock if updates happen too quickly, so endpoints can use this bit to limit the rate at which updates occur.

(Note: this second point was changed based on feedback about the original PR.)

Closes #2214.

This section was pretty old and it had at least one bug, along with a
bunch of editorial lack of clarity.

Substantively, this does two things:

1. It specifies what label to use for the KDF when updating.  The
agreement at one time was that we would use a quic-specific label, but
that never really got captured properly.  This proposes "quic ku".

2. It specifies that acknowledgments of packets with a given key phase
are necessary before another update can be triggered.  As noted in
 #2214, simultaneous updates can result in a deadlock if updates happen
too quickly.  This has the nice effect of allowing update compliance to
be more rigorously tested.

Closes #2214.
@ekr
Copy link
Collaborator

ekr commented Dec 21, 2018

Hmm... I thought we had agreed that we were never going to use acknowledgements as side effects and that's why we rejected the DTLS-style KeyUpdate. I had assumed that instead you were waiting for the other side to change its phase.

Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @ekr. At least in my implementation, the coupling between acknowledgements and key management would create a lot of additional complexity.
I proposed a different solution in #2214 (comment).

An endpoint MUST NOT initiate more than one key update at a time. A new key
cannot be used until the endpoint has received an acknowledgment for a packet it
sends with the new keys. An endpoint that receives a packet protected with old
keys that includes an acknowledgement for a packet protected with newer keys MAY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this error condition. Are you implying that key updates effectively create separate packet number spaces, and ACKs are only valid for the space that they're sent with?
I know that there was a lot of discussion about which packets to include in an ACK frame, but as a high-level strategy it should be allowed to include all packets received within the last one to two RTTs. This would lead to ACK frames that span multiple key phases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the point of this is that you requested a key update and the peer didn't comply. This is how we ensure that endpoints update when requested.

@martinthomson
Copy link
Member Author

We can switch to a two-bit epoch if you like. That addresses the problem. This is the easiest change to specify, but if implementing a two-bit epoch is easier, we have the bits available, it's totally possible.

@marten-seemann
Copy link
Contributor

The thing I like about a two-bit epoch is that it's easy to implement frequent key updates. My hope is that implementation will rotate keys a lot more frequently than the theoretically safe levels - otherwise key rotation will remain an obscure function of the protocol that's only used on 0.01% of the connections, and we all know what that means for interop.

@martinthomson
Copy link
Member Author

The number of bits we use to signal key update doesn't have any effect on how often it will be used. The reason I like it is that it avoids a problem we didn't realize we had until now.

@marten-seemann
Copy link
Contributor

I think it does, since the issue I described depends on all packets of a certain key phase being lost. If you only do key updates for example every 1 million packets, the probability of that would be negligible.

@martinthomson
Copy link
Member Author

What is important is how often people want to update. And even without this bug, I doubt that it is going to be that often. That said, we'll probably eventually want to grease it.

@kazuho
Copy link
Member

kazuho commented Dec 21, 2018

If we are to use another bit, could we use that for indicating the newest KEY_PHASE bit that the peer used for sending?

@marten-seemann
Copy link
Contributor

@kazuho That would make key updates on both sides completely independent from each other, wouldn't it? i.e. I wouldn't have to update my (send) keys when the peer updates its keys.

@martinthomson martinthomson added design An issue that affects the design of the protocol; resolution requires consensus. -tls labels Dec 21, 2018
@kazuho
Copy link
Member

kazuho commented Dec 21, 2018

@marten-seemann IIUC, current design does not enforce that. It's just that the text tells you that an endpoint should update when the peer does. That can be the same.

@martinthomson
Copy link
Member Author

@kazuho, As a practical matter, spending the two bits on epoch gives us more flexibility. You could, if you wanted, have multiple updates in flight with two bits (up to 3 updates then). If you echo a bit, you do fix the bug, but get less out of it.

@marten-seemann, you could, but then you lose the ability to force the other side to update. That's a requirement we discovered in TLS. Say that I know that AES-128-GCM is weak and needs updating every 1 million packets, as opposed to the 200 million or so that we currently understand to be the limit. I can use the current scheme to cause you to update even if you don't know that.

@kazuho
Copy link
Member

kazuho commented Dec 21, 2018

@martinthomson

As a practical matter, spending the two bits on epoch gives us more flexibility. You could, if you wanted, have multiple updates in flight with two bits (up to 3 updates then). If you echo a bit, you do fix the bug, but get less out of it.

I agree. OTOH, assuming that we wouldn't be updating the keys too often, I would prefer the least complex approach on solving the issue (assuming that we agree on spending two bits). And to me it seems that having an explicit signal that indicates that the peer has received the key update might be the simplest.

PS. FWIW, I'm perfectly fine with using ACK as a signal.

@martinthomson
Copy link
Member Author

I think that checking send and receive epoch for equality is simple enough. It's just that N and N+2 would appear to be the same, and two bits fixes that.

to 0 and then inverted with each key update.
possible to update the keys used to protect packets. The Key Phase bits in the
short header are used to indicate whether key updates have occurred. The
Key Phase is initially set to 0 and then incremented with each key update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremented % 4

decrypt the packet that contains the changed bit.
The Key Phase allows a recipient to detect a change in keying material
without needing to receive the first packet that triggered the change. An
endpoint that notices a changed Key Phase bit can update keys and decrypt the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to be able to force the peer to update its keys, shouldn’t this be a MUST?

what it is expecting. The endpoint creates a new read secret and the
corresponding read key and IV using the same process as its peer.

A packet with a Key Phase other than the expected or next value MUST be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous one is ok as well, as long as it’s sent with a smaller packet number.

@kazuho
Copy link
Member

kazuho commented Dec 21, 2018

@martinthomson

I think that checking send and receive epoch for equality is simple enough. It's just that N and N+2 would appear to be the same, and two bits fixes that.

Yeah IMO it's a bit unfortunate that an endpoint might need to update the receiving key twice at once, but I agree it works.

@marten-seemann
Copy link
Contributor

It’s not really updating its key twice at once. The situation where you receive a packet encrypted with N+2 only occurs if both peers already did the simultaneous key update, which means the current key will be N+1 already.

@kazuho
Copy link
Member

kazuho commented Dec 22, 2018

My point is that a receiver needs to assume that it would be receiving either N, N+1, N+2, instead of just assuming that it would receive one of the two consecutive keys.

Doesn't that make the state machine complicated?

We have the following provisions:

  • "An endpoint SHOULD retain old read keys for a period of no more than three times the Probe Timeout (PTO, see {{QUIC-RECOVERY}})."
  • "Packets with higher packet numbers always use the updated keys and MUST NOT be decrypted with old keys."

The proposed change of allowing N+2 encourages an endpoint to maintain the states for three generations of the keys to implement this behavior, instead of just keeping information for two generations.

My preference goes to a design that discourages (or disables) frequent key updates, so that keeping states for two generations is guaranteed to be sufficient.

@kazuho
Copy link
Member

kazuho commented Dec 22, 2018

(following my previous comment) As an example, a design like below ensures that keeping states for two generation of keys is fine and also encourages the endpoint to drop the older generation key at the correct moment.

  • We use one bit for send key phase, and another bit for receive key phase.
  • When the sender updates the send key, it's flips the send key phase bit.
  • When the receiver retires the old receive key, it flips the receive key phase bit.
  • Sender's send key phase bit becoming equal to the peer's receive key phase bit allows the sender to make another key update; until that happens, a sender is not allowed to do a key update.

@martinthomson
Copy link
Member Author

A "May Update" bit. I can see people setting that to a value that prevents key update, but it does have the benefit of signaling when updates are possible in a way that keeps a cap on the number of active keys. Taking into account all other considerations, I think that is better. No time to act on that today, but we'll get to it.

@marten-seemann
Copy link
Contributor

A "May Update" bit. I can see people setting that to a value that prevents key update

Why is this better? I'm not sure if keeping 2 keys around is so much better than having short periods of time (3 PTO) where you keep 3 keys.
Since key updates will happen rarely in practice, I think we should design a mechanism that actually forces a key update, not one that allows a lazy implementation to prevent a key update from happening in the first place.

@kazuho
Copy link
Member

kazuho commented Dec 26, 2018

@marten-seemann We cannot enforce implementors to code correctly.

The issue with the 2-bit KEY_PHASE design is that the receiver has a code path that is very rarely being executed (retaining 3 keys at once). Some stacks might fail to implement it correctly. Or some stacks might not implement key updates correctly at all. Anyways, we might see interoperability issues in the wild, and the bug would be hard to identify, because it's rare and because it would look like an connection stalling to the end user.

"MAY_UPDATE" bit takes a different approach. We do not have a code path that is rarely taken on the receiver side. Hence less chance of having interoperability issues. A receiver that does not allow the sender to update keys can be detected, and then the sender can close the connection with a specific error code. That is much better than the connection silently breaking.

@martinthomson
Copy link
Member Author

At this stage, I'm going to put this on the agenda for discussion in Tokyo. I think that we have three valid options, though we might have ruled the first out:

  1. use acknowledgements to regulate multiple updates
  2. use two phase bits, representing a cyclic sequence (0, 1, 2, 3) and allow two updates ahead
  3. use one phase bit and a second bit indicating whether key updates are permitted

1 is theoretically possible, but likely to be impractical in many implementations, so we can rule that out.

This PR is currently 2, though I'm biased slightly toward 3 now. It doesn't have an error condition where the phase is 3 ahead of expected, but the packet number is higher. More importantly, it reduces the number of active keys after the handshake is complete to at most 2. The drawback is that you can block key updates. That said, some people might consider that to be a feature.

Copy link
Member

@nibanks nibanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I fully support this PR. Just some small nits for the most part.

indicate whether key updates have occurred. The KEY_PHASE bit is initially set
to 0 and then inverted with each key update.
Once the 1-RTT keys are established and confirmed through the use of the
KEYS_READY frame, it is possible to update the keys used to protect packets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, the name KEYS_READY seems to indicate that the keys are about to be used, not that they are in use now. I think I'd prefer something like KEYS_ACTIVE or KEY_IN_USE. Just a personal preference though.

@@ -1259,14 +1259,14 @@ Client Server
Initial[0]: CRYPTO[CH] ->

Initial[0]: CRYPTO[SH] ACK[0]
Handshake[0]: CRYPTO[EE, CERT, CV, FIN]
Handshake[0]: KEYS_READY, CRYPTO[EE, CERT, CV, FIN]
<- 1-RTT[0]: STREAM[1, "..."]

Initial[1]: ACK[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this Initial necessary? Assuming it is coalesced with the Handshake that contains KEYS_READY, wouldn't that cause the Initial keys to be thrown away, implicitly ACK'ing outstanding Initial packets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. The client might send this because it processes the Initial before it processes the KEYS_ACTIVE frame, but it doesn't have to send it.

## KEYS_READY Frame {#frame-keys-ready}

An endpoint sends a KEYS_READY frame (type=0x1e) to signal that it has installed
keys for reading and writing packets. Receipt of this frame in a packet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
keys for reading and writing packets. Receipt of this frame in a packet
keys for both reading and writing packets. Receipt of this frame in a packet

To be a bit more clear, it seems you are only supposed to send this when you are both keys are ready for use, not one or the other, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both. And the clarification is good.


1-RTT[1]: STREAM[55, "..."], ACK[0]
1-RTT[1]: KEYS_READY, STREAM[55, "..."], ACK[0]
<- Handshake[1]: ACK[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly with this Handshake packet. Is it necessary, assuming it was coalesced with the 1-RTT packet? On a side note, if we don't get rid of it, I think it might be better to put the Handshake packet before the 1-RTT packet in the list, since that's the order they'd be coalesced (ditto for the 0-RTT example below).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll remove it on the basis that it's a nice optimization.

Copy link
Member Author

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Nick.

I don't care about the name, and KEYS_ACTIVE is technically more correct.

@@ -1259,14 +1259,14 @@ Client Server
Initial[0]: CRYPTO[CH] ->

Initial[0]: CRYPTO[SH] ACK[0]
Handshake[0]: CRYPTO[EE, CERT, CV, FIN]
Handshake[0]: KEYS_READY, CRYPTO[EE, CERT, CV, FIN]
<- 1-RTT[0]: STREAM[1, "..."]

Initial[1]: ACK[0]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. The client might send this because it processes the Initial before it processes the KEYS_ACTIVE frame, but it doesn't have to send it.


1-RTT[1]: STREAM[55, "..."], ACK[0]
1-RTT[1]: KEYS_READY, STREAM[55, "..."], ACK[0]
<- Handshake[1]: ACK[0]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll remove it on the basis that it's a nice optimization.

## KEYS_READY Frame {#frame-keys-ready}

An endpoint sends a KEYS_READY frame (type=0x1e) to signal that it has installed
keys for reading and writing packets. Receipt of this frame in a packet
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both. And the clarification is good.


An endpoint sends a KEYS_READY frame (type=0x1e) to signal that it has installed
keys for reading and writing packets. Receipt of this frame in a packet
indicates that all earlier keys can be safely discarded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for dropping the Initial and Handshake keys, I had assumed that we would be sending the frame after all the transmissions have been acknowledged rather than when the new keys become available.

The concern of not waiting for ACKs is that packets would be deemed lost, having negative effect on the congestion control. For example, the proposed design suggests the server to drop Handshake keys when it receives ClientFinished, without sending an ACK for the packet that carried ClientFinished.

Are we fine with that?

The other issue regarding the text is that it does not state that the signal is unilateral. My understanding is that an endpoint is expected to drop the keys when both of the following conditions are met: 1) the endpoint has received the done frame, 2) the endpoint is fine with dropping the keys. The text seems to not capture the second condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had some discussions about losing packets during the handshake due to reordering, and I think that we just have to make some allowances in the congestion controller/loss recovery for that possibility. For packets that we abandon like this, we lose the positive signal (i.e., congestion window increase), but we don't necessarily create a negative signal, because abandonment means that you don't declare them lost.

I'll tweak the text about the dropping of keys requiring both sending and receiving of the frame.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the receiving only is a hangup from before I added the bit about delaying transmission of the frame in order to limit the number of active keys. If you don't have that, receipt is sufficient because receiving the frame always implies that you need to send a matching frame.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we just have to make some allowances in the congestion controller/loss recovery for that possibility. For packets that we abandon like this, we lose the positive signal (i.e., congestion window increase), but we don't necessarily create a negative signal, because abandonment means that you don't declare them lost.

I think you are correct in pointing out that there's no negative signal even though we are losing a positive signal. As implied in my previous comment, my preference goes to delaying dropping keys until all the transmissions for that epoch is complete because that allows us to decouple loss-recovery / congestion control and epochs (FWIW, we will not bee seeing ACKs for ServerHello and ClientFinished). But I would not argue strongly, because there are ways to compensate for the missing signal.

I'll tweak the text about the dropping of keys requiring both sending and receiving of the frame.

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to delay discarding Initial keys, because we're would lose the DoS fix. Handshake keys are probably ok, but it is easier to draw the line between handshake and key update.

I don't expect that loss and reordering will be that significant, nor will the loss of positive signals. Any missing packets should be small (just an ACK probably), so the reduced increase to congestion allowance would be negligible.

We could add text to Section 6.2.2 of the recovery draft to this effect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider losses of handshake packets as not triggering congestion window reductions. This is perhaps a bit aggressive, but I would argue that we shouldn't have a lot of these packets anyways, and there are reasons beyond congestion that could cause these packets to be marked as lost.

I think we should go with the cleanest mechanism to drop keys -- this has been tricky to get right -- and we'll fix recovery around that. Once the other mechanisms are set, we can figure out how to best do recovery around them.

Copy link
Member

@kazuho kazuho Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the most architecturally sane way would be to transmit all necessary ACKs for every epoch, and I am not sure how much I buy the argument that dropping Initial keys earlier is a win, considering that there would be 1-RTT injection window anyways. A middlebox that observes an Initial can send CONNECTION_CLOSE to both endpoints (as they do for TCP) to successfully disrupt a connection, regardless of when we drop the Initial key. However I also agree that losing the ability to see the ACK is not a big deal. So maybe it's nothing more than a matter of taste.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer the immediate discarding of keys and implicit ACKs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kazuho, but implementations could quite reasonably be skeptical of packets with CONNECTION_CLOSE and wait to see if others arrive. Injecting a fictitious Initial packet with a broken/different handshake is a much more effective attack. It's true that 1 RTT is probably plenty of time for an attacker, but I'd prefer to minimize the window anyway.

Section 4.10 of {{QUIC-TLS}}) along with any loss recovery and congestion
control state (see Sections 5.3.1.2 and 6.9 of {{QUIC-RECOVERY}}).
Endpoints cease both sending and processing Initial packets when it both sends
and receives a Handshake packet containing a KEYS_ACTIVE frame. Though packets
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should say "sends and receives a Handshake packet or a 1-RTT packet containing a KEYS_ACTIVE frame", assuming that we wouldn't be waiting for ACKs when dropping the keys that are to carry the ACKs.

This is because a Handshake packet carrying just the KEYS_ACTIVE frame can be dropped, while all other packets reach the peers.

Copy link
Contributor

@huitema huitema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits. We need a bit more clarity on the sending of KEY ACTIVE during the handshake, but otherwise this is good.

without necessarily needing to receive the first packet that triggered the
change. An endpoint that notices a changed KEY_PHASE bit can update keys and
decrypt the packet that contains the changed bit.
The Key Phase bit is used to indicate which packet protection keys are used to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused. Line 866 mentions a Key Update bit, and here we see a Key Phase bit. Is that the same bit, or do we now have 2 bits?

@@ -863,7 +863,7 @@ This protection applies to the least-significant bits of the first byte, plus
the Packet Number field. The four least-significant bits of the first byte are
protected for packets with long headers; the five least significant bits of the
first byte are protected for packets with short headers. For both header forms,
this covers the reserved bits and the Packet Number Length field; the Key Phase
this covers the reserved bits and the Packet Number Length field; the Key Update
bit is also protected for packets with a short header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments in Key Update section. I would vote for keeping the Key Phase name.

Once an endpoint has successfully processed a packet with the same key phase, it
can send a KEYS_ACTIVE frame. Endpoints MAY defer sending a KEYS_ACTIVE frame
after a key update (see {{key-update-old-keys}}).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that "processed a packet" means "received, successfully decrypted, and processed the content of the frames." But I have to guess. Encrypting and sending is a form of processing too. The term first occurs in section 8.1, and it is a bit puzzling there too. I would suggest adding a formal definition in "1.2. Terms and Definitions ".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of that text is going to be specified again in the following paragraphs. Do we need a paraphrase here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial to Handshake, and Handshake to 1-RTT. Are these special cases of Key Update?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a note about the handshake to clear that up. Logically, they are the same, but there is nothing to be gained by delaying the signal, or by requiring that the signal be used before a key change can be made.

{{key-update-initiate}}. An endpoint that responds to a key update MUST send a
KEYS_ACTIVE frame to indicate that it is both sending and receiving with updated
keys, though it MAY defer sending the frame (see {{key-update-old-keys}}).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that a restatement of the lines 1167-1169? Why do we have the same text twice, with slightly different phrasing, "processed" vs. "packet protection is successfully removed" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No difference intended. I've tried to even this out.

keys to process delayed packets rather than enabling a new key update. This
only applies to key updates that use the Key Phase bit; endpoints MUST NOT defer
sending of KEYS_ACTIVE during and immediately after the handshake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"key updates that use the Key Phase bit". That's a weird way to put it. Maybe a reference to the ladder diagram in the transport spec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "after the handshake is complete," assuming that's now a defined term?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried very hard to separate key update as a concept from the key changes that happen during the handshake. So I'll remove the 'that use the Key Phase bit' part.

@@ -1259,15 +1259,13 @@ Client Server
Initial[0]: CRYPTO[CH] ->

Initial[0]: CRYPTO[SH] ACK[0]
Handshake[0]: CRYPTO[EE, CERT, CV, FIN]
Handshake[0]: KEYS_ACTIVE, CRYPTO[EE, CERT, CV, FIN]
<- 1-RTT[0]: STREAM[1, "..."]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending the KEY ACTIVE there seems wrong. The logic "I have processed a packet at this key level". That has not happened yet, will only happen at the sending of <- Handshake[1]: ACK[0]. That's the same reason why you correctly do not place a Key Active frame in the packet <- 1-RTT[0]: STREAM[1, "..."]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that the logic to send a KEYS_ACTIVE frame is that you have both read and write keys available for use. Not that you have used them. For that reason, the server doesn't send the KEYS_ACTIVE for 1-RTT, but it does send it for Handshake.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the logic. I had to tweak the description of the frame to account for that though...

@@ -1283,15 +1281,13 @@ Initial[0]: CRYPTO[CH]
0-RTT[0]: STREAM[0, "..."] ->

Initial[0]: CRYPTO[SH] ACK[0]
Handshake[0] CRYPTO[EE, CERT, CV, FIN]
Handshake[0]: KEYS_ACTIVE, CRYPTO[EE, CERT, CV, FIN]
<- 1-RTT[0]: STREAM[1, "..."] ACK[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue. You cannot send Key Active there because the server has not yet received an Handshake packet from the peer.

@@ -3520,7 +3520,7 @@ carries ACKs in either direction.

~~~
+-+-+-+-+-+-+-+-+
|1|1| 0 |R R|P P|
|1|1| 0 | R | P |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change. Makes the PR harder to read.

packets might still be in flight or awaiting acknowledgment, no further Initial
packets need to be exchanged beyond this point. Initial packet protection keys
are discarded (see Section 4.10 of {{QUIC-TLS}}) along with any loss recovery
and congestion control state (see Sections 5.3.1.2 and 6.9 of {{QUIC-RECOVERY}}).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server side specification feels wrong. The server does not know that the peer has received the server initial until it receives the KEY ACTIVE (Handshake) from the client. Yes, there is an "implicit" dependency that the Handshake cannot be processed without receiving first the Initial packet. There will be cases where Initial is lost, and both Initial and Handshake have to be retransmitted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The KEYS_ACTIVE is sent when both read and write keys are ready, but you don't throw away the previous keys until you have sent and received a KEYS_ACTIVE. I'm unsure of what the problem is here. If you receive a KEYS_ACTIVE then the peer must have received all the necessary packets of yours sent with the previous key.

@@ -3843,7 +3847,7 @@ short packet header.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+
|0|1|S|R|R|K|P P|
|0|1|S| R |K| P |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change.

Copy link
Contributor

@MikeBishop MikeBishop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a less-TLS-steeped reader; take the comments for what they're worth. :-)

keys are available to both peers before another can be initiated.

Once an endpoint has successfully processed a packet with the same key phase, it
can send a KEYS_ACTIVE frame. Endpoints MAY defer sending a KEYS_ACTIVE frame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"can" feels very weak. Even if they MAY delay sending, aren't they required to eventually send it?

Endpoints responding to an apparent key update MUST NOT generate a timing
side-channel signal that might indicate that the Key Phase bit was invalid (see
{{header-protect-analysis}}). Endpoints can use dummy packet protection keys in
place of discarded keys when key updates are not permitted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to mention this last, it's probably worth expanding on this a bit. What behavior might generate such a side-channel? How are dummy keys used to emulate normal behavior in those cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Text added.

keys to process delayed packets rather than enabling a new key update. This
only applies to key updates that use the Key Phase bit; endpoints MUST NOT defer
sending of KEYS_ACTIVE during and immediately after the handshake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "after the handshake is complete," assuming that's now a defined term?


Even if old keys are available, those keys MUST NOT be used to protect packets
with packets that have higher packet numbers than packets that were protected
with newer keys. An endpoint that successfully removes protection with old keys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to mandate dropping write keys immediately, but permit retaining read keys?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to mandate dropping keys so directly, but I will mention that it is possible. Note that there are cases during the handshake where write keys for different packet number spaces need to be kept. So this would be limited to key update.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? It seems appropriate to close the connection here.

@huitema
Copy link
Contributor

huitema commented Feb 13, 2019

I left a comment on the mailing list. The more I look at it, the more I think that we should not invent a special mechanism, but simply reuse PATH_CHALLENGE/PATH_RESPONSE. What we are doing is effectively a continuity test for a set of addresses and a key. The challenge/response approach would bring clarity to the "acknowledge or not" issue.

Of course, there is downside to reusing path continuity for testing key continuity. If we want to decouple, then we could create KEY_CHALLENGE/KEY_RESPONSE frames that parallel for keys what the existing challenges to for paths.

Copy link
Contributor

@janaiyengar janaiyengar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some heavy lifting, thank you for doing it! A few comments.

without necessarily needing to receive the first packet that triggered the
change. An endpoint that notices a changed KEY_PHASE bit can update keys and
decrypt the packet that contains the changed bit.
The Key Phase bit is used to indicate which packet protection keys are used to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The Key Phase bit is used to indicate which packet protection keys are used to
The Key Phase bit indicates which packet protection keys are being used to

decrypt the packet that contains the changed bit.
The Key Phase bit is used to indicate which packet protection keys are used to
protect the packet. The Key Phase bit is initially set to 0 for the first set
of 1-RTT packets. The Key Phase bit is toggled to signal each key update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
of 1-RTT packets. The Key Phase bit is toggled to signal each key update.
of 1-RTT packets and toggled for each subsequent key.

change. An endpoint that notices a changed KEY_PHASE bit can update keys and
decrypt the packet that contains the changed bit.
The Key Phase bit is used to indicate which packet protection keys are used to
protect the packet. The Key Phase bit is initially set to 0 for the first set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
protect the packet. The Key Phase bit is initially set to 0 for the first set
protect the packet. The Key Phase bit is set to 0 for the first set

corresponding key and IV are created from that secret as defined in
{{protection-keys}}. The header protection key is not updated.

The endpoint toggles the value of the Key Phase bit, and uses the updated key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The endpoint toggles the value of the Key Phase bit, and uses the updated key
The endpoint toggles the value of the Key Phase bit and uses the updated key

<--------
... Update to @N
@N [1] QUIC Packets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing. The KEYS_ACTIVE sent above are for @m, right? If so, shouldn't the endpoint send a KEYS_ACTIVE for @n as soon as it updates to the new keys (on line 1133)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During a key update, the initiator doesn't update until it receives a packet in the new keys. If we sent the frame in the first packet we'd get the simultaneous update problem.

That makes me a little more receptive to the idea that @kazuho suggests on the mailing list, but not much more. Having special rules for Handshake isn't a huge burden.

p.s., I'm sure that github users @M and @N get pinged all the time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poor Matt Mullenweg.... 😁

containing a KEYS_ACTIVE frame is lost.

Endpoints MUST NOT send KEYS_ACTIVE frames in Initial or 0-RTT packets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move these last 3 paras to a new subsection called "Key Updates During the Handshake" in the "Key Update" section above.

endpoint sends this frame in its first Handshake packet. Once received, an
endpoint can discard Initial keys.

A KEYS_ACTIVE frame used after the completion of the handshake in 1-RTT packets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A KEYS_ACTIVE frame used after the completion of the handshake in 1-RTT packets
A KEYS_ACTIVE frame used in 1-RTT packets immediately after the completion of the handshake

(not any time after handshake but immediately after)

endpoint can discard Initial keys.

A KEYS_ACTIVE frame used after the completion of the handshake in 1-RTT packets
indicates that Handshake keys are no longer needed. A client sends this frame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
indicates that Handshake keys are no longer needed. A client sends this frame
indicates that Handshake keys are no longer needed. A client SHOULD send this frame


A KEYS_ACTIVE frame used after the completion of the handshake in 1-RTT packets
indicates that Handshake keys are no longer needed. A client sends this frame
in its first 1-RTT packet, and a server sends this frame in the first packet it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
in its first 1-RTT packet, and a server sends this frame in the first packet it
in its first 1-RTT packet, and a server SHOULD send this frame in the first packet it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going with MUST.

A KEYS_ACTIVE frame used after the completion of the handshake in 1-RTT packets
indicates that Handshake keys are no longer needed. A client sends this frame
in its first 1-RTT packet, and a server sends this frame in the first packet it
sends after completing the handshake. A server might send 1-RTT keys prior to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A server never sends 1-RTT keys. I'm not sure what this sentence is trying to say.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I meant packets.

@martinduke
Copy link
Contributor

I thought we were moving towards key separation between QUIC and TCP/TLS, but not secret separation. Even with key update, we get key separation when using the quic labels for key and iv, and I see no reason to also worry about the label when we update the secret.

@martinthomson
Copy link
Member Author

It is true that we separate keys, and not secrets, but the change to the update label is a more pragmatic one. You don't want to have a situation where TLS key updates and QUIC key updates end up mixed. Yes, you can't trigger TLS key updates in QUIC, but how else would you get the TLS secrets to roll over.

Mostly, that change was to ensure that the process for updating secrets was documented. When reading the current drafts, I challenge you to find where it says what to do.

@martinduke
Copy link
Contributor

Section 6 of quic-tls refers me to 7.2 of RFC 8446. I read and implemented this (literally) last week and found this to be totally unambiguous, as other sections of quic-tls clearly indicate the labels used to derive key and iv. Perhaps others found it less clear.

I am not sure how TCP and QUIC would be mixed. Key update doesn't change resumption secrets, and if TCP and QUIC are sharing the same TLS instance (?) you're going to have a problem with or without this change.

@martinthomson
Copy link
Member Author

The technical bits of this are superceded by #2673, but I want to keep some of these editorial changes.

martinthomson added a commit that referenced this pull request Sep 19, 2019
This makes some significant editorial changes to the key update section,
hopefully making the various activities clearer and more explicit.

In the process, I am also trying to address #2792, which is the timing
sidechannel created by having the generation of updated keys inline with
packet processing.  In doing so, I'm suggesting that endpoints create
the next keys at some time after the key update happens.  Right now, I'm
thinking 1-2 PTOs is probably close enough to workable.  I've limited
this at 3PTO.  This is, however, just a (firm) suggestion at this stage.

For endpoints that only want to keep 2 sets of keys, this is probably
the right time frame, especially if we keep the current advice for 3PTO
before a subsequent update.

The effect of this is that attempts to update at certain times could
cause all packets after the update to be discarded.  That would only
happen if implementations consistently ignored advice on update
frequency, so I think that's tolerable, but I'd like input on this.

(This also attempts to take up the advice from the other, older PRs on
this subject.)

Closes #2792, #2791, #2237.
@martinthomson
Copy link
Member Author

OBE

@martinthomson martinthomson deleted the simultaneous-update branch December 3, 2019 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-tls design An issue that affects the design of the protocol; resolution requires consensus.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

simultaneous key update can lead to deadlock
10 participants